Combining Unsupervised Variable Selection with Dimensionality Reduction

Author

  • Lior Wolf
Abstract

This paper bridges the gap between variable selection methods (e.g., Pearson coefficients, the KS test) and dimensionality reduction algorithms (e.g., PCA, LDA). Variable selection algorithms have difficulty with highly correlated data, because many features are similar in quality. Dimensionality reduction algorithms tend to combine all variables and cannot select the significant variables out of a set of features. Our approach combines both methodologies by applying variable selection followed by dimensionality reduction. The key point is to optimize the same utility function in both stages. The resulting algorithm can benefit from complex features, as variable selection algorithms do, while also enjoying the benefits of dimensionality reduction.
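
The two-stage idea can be illustrated with a minimal sketch. The abstract does not say which utility function the paper optimizes, so the code below assumes sample variance as the shared utility purely for illustration: stage one keeps the columns with the highest variance, and stage two applies PCA, which maximizes projected variance, to the kept columns. The function name `select_then_reduce` and its parameters `n_keep` and `n_components` are hypothetical, and this is not the author's algorithm.

```python
import numpy as np


def select_then_reduce(X, n_keep, n_components):
    """Two-stage sketch: variance-based selection, then PCA.

    Assumption (not from the paper): variance is the utility
    optimized in both stages.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each variable

    # Stage 1: keep the n_keep columns with the largest variance.
    variances = Xc.var(axis=0)
    keep = np.sort(np.argsort(variances)[::-1][:n_keep])
    Xs = Xc[:, keep]

    # Stage 2: PCA on the kept columns via SVD. The rows of Vt are the
    # directions of maximal variance, in decreasing order.
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    components = Vt[:n_components]
    Z = Xs @ components.T  # low-dimensional representation
    return keep, components, Z


# Toy data: three correlated, high-variance columns plus five noise columns.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
informative = 3.0 * latent + 0.5 * rng.normal(size=(200, 3))
noise = 0.1 * rng.normal(size=(200, 5))
X_demo = np.hstack([informative, noise])

keep, components, Z = select_then_reduce(X_demo, n_keep=3, n_components=2)
print("kept columns:", keep)
print("reduced shape:", Z.shape)
```

In this toy setup the correlated columns are all retained and then merged by the projection, rather than being thinned out as redundant by a sparse selector.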

Similar resources

Exploring the Gap Between Variable Selection and Dimensionality Reduction

The Problem: This project addresses the gap between variable selection algorithms and dimensionality reduction algorithms. Variable selection algorithms are designed to produce sparse solutions where only a few variables are marked as relevant. This is not suitable for highly correlated data such as the gray values of an image. Dimensionality reduction algorithms (e.g., PCA) tend to combine al...


Relevance Analysis of Stochastic Biosignals for Identification of Pathologies

This paper presents a complementary study of the methodology for diagnosing pathologies, based on relevance analysis of stochastic (time-variant) features extracted from time-frequency (t-f) representations of biosignal recordings. Dimension reduction is carried out by adapting in time commonly used latent variable techniques for a given relevance function, as an evaluation measure of time-variant tran...


A Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters

Redundant and irrelevant features in high-dimensional data increase the complexity of the underlying mathematical models. It is necessary to conduct pre-processing steps that search for the most relevant features in order to reduce the dimensionality of the data. This study used a meta-heuristic search approach that relies on lightweight random simulations to balance between the exploitation of ...


Probabilistic Additive Component Analysis: A Latent Variable Model for Dimensionality Reduction of Human Functional Magnetic Resonance Images

In recent years, an important new application of machine learning research has emerged from the field of cognitive neuroscience. In ‘mind-reading’ experiments, a machine learning classifier is trained to predict aspects of a human subject’s mental state from patterns of brain activity recorded in a functional MRI (fMRI) scanner. However, a typical fMRI dataset consists of relatively few, noi...


Steel Consumption Forecasting Using Nonlinear Pattern Recognition Model Based on Self-Organizing Maps

Steel consumption is a critical factor affecting pricing decisions and a key element in achieving sustainable industrial development. The main purpose of this paper is to forecast future trends in steel consumption by analyzing nonlinear patterns with artificial intelligence (AI) techniques. Because there are several features affecting the target variable, which make the analysis of relations...


Publication date: 2004